313 research outputs found
On Identifying Disaster-Related Tweets: Matching-based or Learning-based?
Social media such as tweets are emerging as platforms contributing to
situational awareness during disasters. Information shared on Twitter by both
affected population (e.g., requesting assistance, warning) and those outside
the impact zone (e.g., providing assistance) would help first responders,
decision makers, and the public to understand the situation first-hand.
Effective use of such information requires timely selection and analysis of
tweets that are relevant to a particular disaster. Even though abundant tweets
are promising as a data source, it is challenging to automatically identify
relevant messages since tweet are short and unstructured, resulting to
unsatisfactory classification performance of conventional learning-based
approaches. Thus, we propose a simple yet effective algorithm to identify
relevant messages based on matching keywords and hashtags, and provide a
comparison between matching-based and learning-based approaches. To evaluate
the two approaches, we put them into a framework specifically proposed for
analyzing disaster-related tweets. Analysis results on eleven datasets with
various disaster types show that our technique provides relevant tweets of
higher quality and more interpretable results of sentiment analysis tasks when
compared to learning approach
How to evaluate multiple range-sum queries progressively
Decision support system users typically submit batches of range-sum queries simultaneously rather than issuing individual, unrelated queries. We propose a wavelet based technique that exploits I/O sharing across a query batch to evaluate the set of queries progressively and efficiently. The challenge is that now controlling the structure of errors across query results becomes more critical than minimizing error per individual query. Consequently, we define a class of structural error penalty functions and show how they are controlled by our technique. Experiments demonstrate that our technique is efficient as an exact algorithm, and the progressive estimates are accurate, even after less than one I/O per query
- …